Goals

The goal of this tutorial is to help students gain familiarity with differential gene expression analysis using RNA-Seq data. To do this, we will work through a recent tutorial article (Law et al. 2016).

Please refer to the original article for the logic underlying each analytical step. What I’ve tried to do in this document is provide the code-chunks in a more simply executable format (an RMarkdown document), and provide some more explanatory comments as to what each code chunk is doing. Note that the original author’s code is available as a BioConductor Workflow.

In the lecture, I’ll try to present both the logic and give an overview of what the code is doing.

Setup

Load R Packages

This first chunk loads R packages available from CRAN:

Next, we load additional packages from BioConductor. Note that BioConductor is a repository for many bioinformatics-focussed R packages, while CRAN is more general.

# If you don't have the packages installed, uncomment and run this:
# source("https://bioconductor.org/biocLite.R")
# biocLite(c("edgeR", "limma", "Mus.musculus", "Glimma", "gplots"))

# Load Bioconductor libraries 
library(limma) 
library(edgeR)
library(Mus.musculus)
library(Glimma)
library(gplots)

Data packaging

Reading in count data

First, we’ll download and then test reading in the raw read count files.

# This code will download and extract the raw count data
url <- "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE63310&format=file"
utils::download.file(url, destfile = "GSE63310_RAW.tar", mode = "wb") 
utils::untar("GSE63310_RAW.tar", exdir = ".")
# Note how we generate a list of pathed files by concatenating 
#  the directory, filename, and extension
files <- c("GSM1545535_10_6_5_11.txt", 
           "GSM1545536_9_6_5_11.txt", 
           "GSM1545538_purep53.txt", 
           "GSM1545539_JMS8-2.txt", 
           "GSM1545540_JMS8-3.txt", 
           "GSM1545541_JMS8-4.txt", 
           "GSM1545542_JMS8-5.txt", 
           "GSM1545544_JMS9-P7c.txt", 
           "GSM1545545_JMS9-P8c.txt")
compressed_files <- str_c(files, ".gz")
# This just demonstrates that file reading works by looking at the first
#  five lines of the first file
read.delim(compressed_files[1], nrow = 5)
##    EntrezID GeneLength Count
## 1    497097       3634     1
## 2 100503874       3259     0
## 3 100038431       1634     0
## 4     19888       9747     0
## 5     20671       3130     1

Now, instead of reading the count files directly, we’ll use the readDGE function from the edgeR package to read things into a more analysis-ready format:

x <- readDGE(compressed_files, columns = c(1,3))
class(x)
## [1] "DGEList"
## attr(,"package")
## [1] "edgeR"
dim(x)
## [1] 27179     9

Note that the object containing our count data is not a simple data frame - it’s a DGEList object. We can still access the sample table and count table by accessing the object (using x$samples and x$counts).

Organising sample information

Set up the experimental design data:

# Extract sample names from the names of the input text files
samplenames <- substring(colnames(x), 12, nchar(colnames(x)) - 4) 
samplenames
## [1] "10_6_5_11" "9_6_5_11"  "purep53"   "JMS8-2"    "JMS8-3"    "JMS8-4"   
## [7] "JMS8-5"    "JMS9-P7c"  "JMS9-P8c"
# Set up the column names, and manually label the experimental groups and lanes
colnames(x) <- samplenames
group <- as.factor(c("LP", "ML", "Basal", "Basal", "ML", "LP", "Basal", "ML", "LP")) 
x$samples$group <- group 
lane <- as.factor(rep(c("L004","L006","L008"), c(3,4,2))) 
x$samples$lane <- lane 

# Now we can re-inspect the sample data that we've re-labelled
x$samples
##                                 files group lib.size norm.factors lane
## 10_6_5_11 GSM1545535_10_6_5_11.txt.gz    LP 32863052            1 L004
## 9_6_5_11   GSM1545536_9_6_5_11.txt.gz    ML 35335491            1 L004
## purep53     GSM1545538_purep53.txt.gz Basal 57160817            1 L004
## JMS8-2       GSM1545539_JMS8-2.txt.gz Basal 51368625            1 L006
## JMS8-3       GSM1545540_JMS8-3.txt.gz    ML 75795034            1 L006
## JMS8-4       GSM1545541_JMS8-4.txt.gz    LP 60517657            1 L006
## JMS8-5       GSM1545542_JMS8-5.txt.gz Basal 55086324            1 L006
## JMS9-P7c   GSM1545544_JMS9-P7c.txt.gz    ML 21311068            1 L008
## JMS9-P8c   GSM1545545_JMS9-P8c.txt.gz    LP 19958838            1 L008

Make an extra table of the sample data:

# Note that this chunk uses 'tidyverse' functions and syntax
# Read as "Make a data frame, add the sample name, remove the file name, then sort by group and library size"
sample_data.df <- as.data.frame(x$samples)
sample_data.df %>%
  rownames_to_column("sample") %>%
  dplyr::select(-files) %>%
  arrange(group, lib.size) %>%
  kable(caption = "Experimental Design Table")
Experimental Design Table
sample group lib.size norm.factors lane
JMS8-2 Basal 51368625 1 L006
JMS8-5 Basal 55086324 1 L006
purep53 Basal 57160817 1 L004
JMS9-P8c LP 19958838 1 L008
10_6_5_11 LP 32863052 1 L004
JMS8-4 LP 60517657 1 L006
JMS9-P7c ML 21311068 1 L008
9_6_5_11 ML 35335491 1 L004
JMS8-3 ML 75795034 1 L006

Organising gene annotations

Retrieve gene annotation information. This uses the BioConductor package Mus.musculus, which contains many useful mouse gene identifiers.

# Extract the gene IDs from the DGElist object
geneid <- rownames(x) 
# Use the BioConductor package to retrieve more useful gene names
genes <- select(Mus.musculus, keys = geneid, 
                columns = c("SYMBOL", "TXCHROM"), 
                keytype = "ENTREZID")

# Check the dimensions of the returned object, and inspect the first few rows
dim(genes)
## [1] 27220     3
head(genes)
##    ENTREZID  SYMBOL TXCHROM
## 1    497097    Xkr4    chr1
## 2 100503874 Gm19938    <NA>
## 3 100038431 Gm10568    <NA>
## 4     19888     Rp1    chr1
## 5     20671   Sox17    chr1
## 6     27395  Mrpl15    chr1
# Since there are multiple identifiers for some genes, 
#  keep only the first occurrence of each gene
genes <- genes[!duplicated(genes$ENTREZID),]

# Now add the gene names back to our DGEList object
x$genes <- genes
x
## An object of class "DGEList"
## $samples
##                                 files group lib.size norm.factors lane
## 10_6_5_11 GSM1545535_10_6_5_11.txt.gz    LP 32863052            1 L004
## 9_6_5_11   GSM1545536_9_6_5_11.txt.gz    ML 35335491            1 L004
## purep53     GSM1545538_purep53.txt.gz Basal 57160817            1 L004
## JMS8-2       GSM1545539_JMS8-2.txt.gz Basal 51368625            1 L006
## JMS8-3       GSM1545540_JMS8-3.txt.gz    ML 75795034            1 L006
## JMS8-4       GSM1545541_JMS8-4.txt.gz    LP 60517657            1 L006
## JMS8-5       GSM1545542_JMS8-5.txt.gz Basal 55086324            1 L006
## JMS9-P7c   GSM1545544_JMS9-P7c.txt.gz    ML 21311068            1 L008
## JMS9-P8c   GSM1545545_JMS9-P8c.txt.gz    LP 19958838            1 L008
## 
## $counts
##            Samples
## Tags        10_6_5_11 9_6_5_11 purep53 JMS8-2 JMS8-3 JMS8-4 JMS8-5
##   497097            1        2     342    526      3      3    535
##   100503874         0        0       5      6      0      0      5
##   100038431         0        0       0      0      0      0      1
##   19888             0        1       0      0     17      2      0
##   20671             1        1      76     40     33     14     98
##            Samples
## Tags        JMS9-P7c JMS9-P8c
##   497097           2        0
##   100503874        0        0
##   100038431        0        0
##   19888            1        0
##   20671           18        8
## 27174 more rows ...
## 
## $genes
##    ENTREZID  SYMBOL TXCHROM
## 1    497097    Xkr4    chr1
## 2 100503874 Gm19938    <NA>
## 3 100038431 Gm10568    <NA>
## 4     19888     Rp1    chr1
## 5     20671   Sox17    chr1
## 27174 more rows ...

What we’ve done so far:

  • Read in all the raw count data
  • Set up the experimental design details
  • Retrieved human-readable gene names

Data pre-processing

Transformations from the raw-scale

Convert count data to CPM and log-CPM to normalize the counts between different libraries.

# Check the documentation for this function to see what's going on with ?cpm

# This generates a matrix of expression values in CPM
cpm <- cpm(x) 
# ... and a matrix of CPM for log2+0.25 counts
# Note that we add 0.25 to the raw values to avoid taking the log of 0
lcpm <- cpm(x, log = TRUE)

Removing genes that are lowly expressed

Check for genes with no count data, and remove other lowly-expressed genes:

# Read as: "Make me a table by counting how many rows there are where the sum of the row for all 9 samples is 0"
table(rowSums(x$counts == 0) == 9)
## 
## FALSE  TRUE 
## 22026  5153
# Read as: "Find me all the rows (i.e. genes) where the CPM is >1 in at least 3 of the 9 samples"
keep.exprs <- rowSums(cpm > 1) >= 3
dim(x)
## [1] 27179     9
x <- x[keep.exprs, , keep.lib.sizes = FALSE]
dim(x)
## [1] 14165     9

Note that we’ve now subset down from our original list of 27,179 genes to a more manageable 14,165.

Plot count data for raw and unfiltered data sets. Note that the plots below use the ‘base R’ plotting system, which can be a bit difficult to read. Essentially, we’re plotting two density distributions.

nsamples <- ncol(x)
col <- brewer.pal(nsamples, "Paired")
par(mfrow = c(1,2))
plot(density(lcpm[,1]), 
     col = col[1], 
     lwd = 2, 
     ylim = c(0,0.21), las = 2,
     main = "", xlab = "")
title(main = "A. Raw data", xlab = "Log-cpm")
abline(v = 0, lty = 3)
for (i in 2:nsamples) {
 den <- density(lcpm[,i])
 lines(den$x, den$y, col = col[i], lwd = 2)
}
legend("topright", samplenames, text.col = col, bty = "n")
lcpm <- cpm(x, log = TRUE)
plot(density(lcpm[,1]), 
     col = col[1], lwd = 2, 
     ylim = c(0,0.21), las = 2,
     main = "", xlab = "")
title(main = "B. Filtered data", xlab = "Log-cpm")
abline(v = 0, lty = 3)
for (i in 2:nsamples) {
   den <- density(lcpm[,i])
   lines(den$x, den$y, col = col[i], lwd = 2)
}
legend("topright", samplenames, text.col = col, bty = "n")

Normalising gene expression distributions

Calculate normalization factors. This scales the expression values for each sample using the ‘trimmed mean of M-values’ (TMM) method:

# Calculate and then display the library normalization factors
x <- calcNormFactors(x, method = "TMM")
x$samples$norm.factors
## [1] 0.8957309 1.0349196 1.0439552 1.0405040 1.0323599 0.9223424 0.9836603
## [8] 1.0827381 0.9792607
# Make a copy so we can show a better comparison
# This scales the first sample to 5%, and the second sample to 500%, so the 
#  effect of the normalization is more apparent
x2 <- x
x2$samples$norm.factors <- 1
x2$counts[,1] <- ceiling(x2$counts[,1]*0.05)
x2$counts[,2] <- x2$counts[,2]*5

Plot the effect of normalization for the tweaked data set:

par(mfrow = c(1,2))
lcpm <- cpm(x2, log = TRUE)
boxplot(lcpm, las = 2, col = col, main = "")
title(main = "A. Example: Unnormalised data", ylab = "Log-cpm")
x2 <- calcNormFactors(x2)
x2$samples$norm.factors
## [1] 0.05472223 6.13059440 1.22927355 1.17051887 1.21487709 1.05622968
## [7] 1.14587663 1.26129350 1.11702264
## [1] 0.0547 6.1306 1.2293 1.1705 1.2149 1.0562 1.1459 1.2613 1.1170

lcpm <- cpm(x2, log = TRUE)
boxplot(lcpm, las = 2, col = col, main = "")
title(main = "B. Example: Normalised data", ylab = "Log-cpm")

Unsupervised clustering of samples

Plot unsupervised clustering of samples - note that the first two dimensions are useful for explaining the sample grouping, while the 3rd and 4th dimensions seem to correlate with the sequencing lane.

lcpm <- cpm(x, log = TRUE)
par(mfrow = c(1,2))
col.group <- group
levels(col.group) <- brewer.pal(nlevels(col.group), "Set1")
col.group <- as.character(col.group)
col.lane <- lane
levels(col.lane) <- brewer.pal(nlevels(col.lane), "Set2")
col.lane <- as.character(col.lane)
plotMDS(lcpm, labels = group, col = col.group)
title(main = "A. Sample groups")

plotMDS(lcpm, labels = lane, col = col.lane, dim = c(3,4))
title(main = "B. Sequencing lanes")

# This chunk will not be executed when rendering - run it manually to launch 
#  the interactive plot in a browser
glMDSPlot(lcpm, labels = paste(group, lane, sep = "_"), 
          groups = x$samples[,c(2,5)],
          launch = TRUE)

Differential expression analysis

Creating a design matrix and contrasts

Set up the experimental design matrix and contrast matrix:

# Design matrix - this specifies the formula for the linear model used
design <- model.matrix(~0 + group + lane)
colnames(design) <- gsub("group", "", colnames(design))
design
##   Basal LP ML laneL006 laneL008
## 1     0  1  0        0        0
## 2     0  0  1        0        0
## 3     1  0  0        0        0
## 4     1  0  0        1        0
## 5     0  0  1        1        0
## 6     0  1  0        1        0
## 7     1  0  0        1        0
## 8     0  0  1        0        1
## 9     0  1  0        0        1
## attr(,"assign")
## [1] 1 1 1 2 2
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## attr(,"contrasts")$lane
## [1] "contr.treatment"
# Contrast matrix - this sets up the comparisons we want to make
contr.matrix <- makeContrasts(
   BasalvsLP = Basal - LP,
   BasalvsML = Basal - ML, 
   LPvsML = LP - ML,
   levels = colnames(design))
contr.matrix
##           Contrasts
## Levels     BasalvsLP BasalvsML LPvsML
##   Basal            1         1      0
##   LP              -1         0      1
##   ML               0        -1     -1
##   laneL006         0         0      0
##   laneL008         0         0      0

Removing heteroscedascity from count data

Generate voom plots to remove heteroscedascity from the data:

v <- voom(x, design, plot = TRUE)
v
## An object of class "EList"
## $genes
##    ENTREZID SYMBOL TXCHROM
## 1    497097   Xkr4    chr1
## 6     27395 Mrpl15    chr1
## 7     18777 Lypla1    chr1
## 9     21399  Tcea1    chr1
## 10    58175  Rgs20    chr1
## 14160 more rows ...
## 
## $targets
##                                 files group lib.size norm.factors lane
## 10_6_5_11 GSM1545535_10_6_5_11.txt.gz    LP 29409426    0.8957309 L004
## 9_6_5_11   GSM1545536_9_6_5_11.txt.gz    ML 36528591    1.0349196 L004
## purep53     GSM1545538_purep53.txt.gz Basal 59598629    1.0439552 L004
## JMS8-2       GSM1545539_JMS8-2.txt.gz Basal 53382070    1.0405040 L006
## JMS8-3       GSM1545540_JMS8-3.txt.gz    ML 78175314    1.0323599 L006
## JMS8-4       GSM1545541_JMS8-4.txt.gz    LP 55762781    0.9223424 L006
## JMS8-5       GSM1545542_JMS8-5.txt.gz Basal 54115150    0.9836603 L006
## JMS9-P7c   GSM1545544_JMS9-P7c.txt.gz    ML 23043111    1.0827381 L008
## JMS9-P8c   GSM1545545_JMS9-P8c.txt.gz    LP 19525423    0.9792607 L008
## 
## $E
##         Samples
## Tags     10_6_5_11  9_6_5_11   purep53    JMS8-2    JMS8-3    JMS8-4
##   497097 -4.293244 -3.869026  2.522753  3.302006 -4.481286 -3.993876
##   27395   3.875010  4.400568  4.521172  4.570624  4.322845  3.786547
##   18777   4.707695  5.559334  5.400569  5.171235  5.627798  5.081794
##   21399   4.784462  4.741999  5.374548  5.130925  4.848030  4.944024
##   58175   3.943567  3.294875 -1.767924 -1.880302  2.993289  3.357379
##         Samples
## Tags        JMS8-5  JMS9-P7c  JMS9-P8c
##   497097  3.306782 -3.204336 -5.287282
##   27395   3.918878  4.345642  4.132678
##   18777   5.080061  5.757404  5.150470
##   21399   5.158292  5.036933  4.987679
##   58175  -2.114104  3.142621  3.523290
## 14160 more rows ...
## 
## $weights
##           [,1]      [,2]      [,3]     [,4]      [,5]      [,6]      [,7]
## [1,]  1.183974  1.183974 20.526779 20.97747  1.773562  1.217142 21.125740
## [2,] 20.879554 26.561871 31.596323 29.66102 32.558344 26.745293 29.792090
## [3,] 28.003202 33.695540 34.845507 34.45673 35.148529 33.550527 34.517259
## [4,] 27.670233 29.595778 34.901302 34.43298 34.841349 33.159425 34.493456
## [5,] 19.737381 18.658333  3.184207  2.62986 24.191635 24.014937  2.648747
##           [,8]      [,9]
## [1,]  1.183974  1.183974
## [2,] 21.900102 17.150677
## [3,] 31.440457 25.228325
## [4,] 26.136796 24.502247
## [5,] 13.149278 14.351930
## 14160 more rows ...
## 
## $design
##   Basal LP ML laneL006 laneL008
## 1     0  1  0        0        0
## 2     0  0  1        0        0
## 3     1  0  0        0        0
## 4     1  0  0        1        0
## 5     0  0  1        1        0
## 6     0  1  0        1        0
## 7     1  0  0        1        0
## 8     0  0  1        0        1
## 9     0  1  0        0        1
## attr(,"assign")
## [1] 1 1 1 2 2
## attr(,"contrasts")
## attr(,"contrasts")$group
## [1] "contr.treatment"
## 
## attr(,"contrasts")$lane
## [1] "contr.treatment"

Fitting linear models for comparisons of interest

Now that we’ve normalized and filtered our data, we can fit linear models. Review the original article and earlier voom/limma papers for more details here.

vfit <- lmFit(v, design)
vfit <- contrasts.fit(vfit, contrasts = contr.matrix)
efit <- eBayes(vfit)
plotSA(efit)

Examining the number of DE genes

Get counts of DE genes:

# Get all the results
summary(decideTests(efit))
##    BasalvsLP BasalvsML LPvsML
## -1      4127      4338   2895
## 0       5740      5655   8825
## 1       4298      4172   2445
# Add a filter for logFC
tfit <- treat(vfit, lfc = 1)
dt <- decideTests(tfit)
summary(dt)
##    BasalvsLP BasalvsML LPvsML
## -1      1417      1512    203
## 0      11030     10895  13780
## 1       1718      1758    182
# Find commonly DE genes
de.common <- which(dt[,1] != 0 & dt[,2] != 0)
length(de.common)
## [1] 2409
head(tfit$genes$SYMBOL[de.common], n = 20)
##  [1] "Xkr4"          "Rgs20"         "Cpa6"          "Sulf1"        
##  [5] "Eya1"          "Msc"           "Sbspon"        "Pi15"         
##  [9] "Crispld1"      "Kcnq5"         "Ptpn18"        "Arhgef4"      
## [13] "2010300C02Rik" "Aff3"          "Npas2"         "Tbc1d8"       
## [17] "Creg2"         "Il1r1"         "Il18r1"        "Il18rap"
# Make a Venn diagram and write results to a file
vennDiagram(dt[,1:2], circle.col = c("turquoise", "salmon"))
write.fit(tfit, dt, file = "results.txt")

Examining individual DE genes from top to bottom

Generate tables of DE genes for the two comparisons:

basal.vs.lp <- topTreat(tfit, coef = 1, n = Inf)
basal.vs.ml <- topTreat(tfit, coef = 2, n = Inf)
head(basal.vs.lp)
##        ENTREZID SYMBOL TXCHROM     logFC  AveExpr         t      P.Value
## 12759     12759    Clu   chr14 -5.442877 8.857907 -33.44429 3.990899e-10
## 53624     53624  Cldn7   chr11 -5.514605 6.296762 -32.94533 4.503694e-10
## 242505   242505  Rasef    chr4 -5.921741 5.119585 -31.77625 6.063249e-10
## 67451     67451   Pkp2   chr16 -5.724823 4.420495 -30.65370 8.010456e-10
## 228543   228543   Rhov    chr2 -6.253427 5.486640 -29.46244 1.112729e-09
## 70350     70350  Basp1   chr15 -6.073297 5.248349 -28.64890 1.380545e-09
##           adj.P.Val
## 12759  2.703871e-06
## 53624  2.703871e-06
## 242505 2.703871e-06
## 67451  2.703871e-06
## 228543 2.703871e-06
## 70350  2.703871e-06

Useful graphical representations of differential expression results

plotMD(tfit, column = 1, status = dt[,1], 
       main = colnames(tfit)[1], xlim = c(-8,13))

Make an interactive plot with Glimma:

glMDPlot(tfit, coef = 1, status = dt, main = colnames(tfit)[1],
         id.column = "ENTREZID", counts = x$counts, 
         groups = group, launch = TRUE)

Create a heatmap of DE genes:

basal.vs.lp.topgenes <- basal.vs.lp$ENTREZID[1:100]
i <- which(v$genes$ENTREZID %in% basal.vs.lp.topgenes)
mycol <- colorpanel(1000,"blue","white","red")
heatmap.2(v$E[i,], scale = "row",
   labRow = v$genes$SYMBOL[i], labCol = group,
   col = mycol, trace = "none", density.info = "none", 
   margin = c(8,6), lhei = c(2,10), dendrogram = "column")

Gene set testing with camera

Load gene sets and perform gene set testing:

# Load mouse gene sets
load(url("http://bioinf.wehi.edu.au/software/MSigDB/mouse_c2_v5p1.rdata")) 
# Generate identifiers
idx <- ids2indices(Mm.c2, identifiers = rownames(v)) 
cam.BasalvsLP <- camera(v, idx, design, contrast = contr.matrix[,1]) 
head(cam.BasalvsLP,5)
##                                             NGenes Direction       PValue
## LIM_MAMMARY_STEM_CELL_UP                       739        Up 1.134757e-18
## LIM_MAMMARY_STEM_CELL_DN                       630      Down 1.569957e-15
## ROSTY_CERVICAL_CANCER_PROLIFERATION_CLUSTER    163        Up 1.437987e-13
## SOTIRIOU_BREAST_CANCER_GRADE_1_VS_3_UP         183        Up 2.181862e-13
## LIM_MAMMARY_LUMINAL_PROGENITOR_UP               87      Down 6.734613e-13
##                                                      FDR
## LIM_MAMMARY_STEM_CELL_UP                    5.360590e-15
## LIM_MAMMARY_STEM_CELL_DN                    3.708238e-12
## ROSTY_CERVICAL_CANCER_PROLIFERATION_CLUSTER 2.264351e-10
## SOTIRIOU_BREAST_CANCER_GRADE_1_VS_3_UP      2.576779e-10
## LIM_MAMMARY_LUMINAL_PROGENITOR_UP           6.362863e-10
cam.BasalvsML <- camera(v, idx, design, contrast = contr.matrix[,2]) 
head(cam.BasalvsML,5)
##                                             NGenes Direction       PValue
## LIM_MAMMARY_STEM_CELL_UP                       739        Up 5.090937e-23
## LIM_MAMMARY_STEM_CELL_DN                       630      Down 5.132446e-19
## LIM_MAMMARY_LUMINAL_MATURE_DN                  166        Up 8.875174e-16
## LIM_MAMMARY_LUMINAL_MATURE_UP                  180      Down 6.287301e-13
## ROSTY_CERVICAL_CANCER_PROLIFERATION_CLUSTER    163        Up 1.684323e-12
##                                                      FDR
## LIM_MAMMARY_STEM_CELL_UP                    2.404959e-19
## LIM_MAMMARY_STEM_CELL_DN                    1.212284e-15
## LIM_MAMMARY_LUMINAL_MATURE_DN               1.397544e-12
## LIM_MAMMARY_LUMINAL_MATURE_UP               7.425303e-10
## ROSTY_CERVICAL_CANCER_PROLIFERATION_CLUSTER 1.591348e-09
cam.LPvsML <- camera(v, idx, design, contrast = contr.matrix[,3]) 
head(cam.LPvsML,5)
##                                         NGenes Direction       PValue
## LIM_MAMMARY_LUMINAL_MATURE_UP              180      Down 8.497295e-14
## LIM_MAMMARY_LUMINAL_MATURE_DN              166        Up 1.439890e-13
## LIM_MAMMARY_LUMINAL_PROGENITOR_UP           87        Up 3.840915e-11
## REACTOME_RESPIRATORY_ELECTRON_TRANSPORT     91      Down 2.655349e-08
## NABA_CORE_MATRISOME                        222        Up 4.430361e-08
##                                                  FDR
## LIM_MAMMARY_LUMINAL_MATURE_UP           3.401020e-10
## LIM_MAMMARY_LUMINAL_MATURE_DN           3.401020e-10
## LIM_MAMMARY_LUMINAL_PROGENITOR_UP       6.048160e-08
## REACTOME_RESPIRATORY_ELECTRON_TRANSPORT 3.135967e-05
## NABA_CORE_MATRISOME                     4.185805e-05
barcodeplot(efit$t[,3], index = idx$LIM_MAMMARY_LUMINAL_MATURE_UP, 
            index2 = idx$LIM_MAMMARY_LUMINAL_MATURE_DN, 
            main = "LPvsML")

Software Versions

sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: OS X El Capitan 10.11.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] gplots_3.0.1                            
##  [2] Glimma_1.4.0                            
##  [3] Mus.musculus_1.3.1                      
##  [4] TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.0
##  [5] org.Mm.eg.db_3.4.1                      
##  [6] GO.db_3.4.1                             
##  [7] OrganismDbi_1.18.0                      
##  [8] GenomicFeatures_1.28.4                  
##  [9] GenomicRanges_1.28.4                    
## [10] GenomeInfoDb_1.12.2                     
## [11] AnnotationDbi_1.38.1                    
## [12] IRanges_2.10.2                          
## [13] S4Vectors_0.14.3                        
## [14] Biobase_2.36.2                          
## [15] BiocGenerics_0.22.0                     
## [16] edgeR_3.18.1                            
## [17] limma_3.32.3                            
## [18] stringr_1.2.0                           
## [19] RColorBrewer_1.1-2                      
## [20] knitr_1.16                              
## [21] dplyr_0.7.1                             
## [22] purrr_0.2.2.2                           
## [23] readr_1.1.1                             
## [24] tidyr_0.6.3                             
## [25] tibble_1.3.3                            
## [26] ggplot2_2.2.1                           
## [27] tidyverse_1.1.1                         
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-131               bitops_1.0-6              
##  [3] matrixStats_0.52.2         lubridate_1.6.0           
##  [5] bit64_0.9-7                httr_1.2.1                
##  [7] rprojroot_1.2              tools_3.4.1               
##  [9] backports_1.1.0            R6_2.2.2                  
## [11] KernSmooth_2.23-15         DBI_0.7                   
## [13] lazyeval_0.2.0             colorspace_1.3-2          
## [15] mnormt_1.5-5               bit_1.1-12                
## [17] compiler_3.4.1             graph_1.54.0              
## [19] rvest_0.3.2                xml2_1.1.1                
## [21] DelayedArray_0.2.7         rtracklayer_1.36.4        
## [23] caTools_1.17.1             scales_0.4.1              
## [25] psych_1.7.5                RBGL_1.52.0               
## [27] digest_0.6.12              Rsamtools_1.28.0          
## [29] foreign_0.8-69             rmarkdown_1.6             
## [31] XVector_0.16.0             pkgconfig_2.0.1           
## [33] htmltools_0.3.6            highr_0.6                 
## [35] rlang_0.1.1                readxl_1.0.0              
## [37] RSQLite_2.0                BiocInstaller_1.26.0      
## [39] bindr_0.1                  jsonlite_1.5              
## [41] gtools_3.5.0               BiocParallel_1.10.1       
## [43] RCurl_1.95-4.8             magrittr_1.5              
## [45] GenomeInfoDbData_0.99.0    Matrix_1.2-10             
## [47] Rcpp_0.12.11               munsell_0.4.3             
## [49] stringi_1.1.5              yaml_2.1.14               
## [51] SummarizedExperiment_1.6.3 zlibbioc_1.22.0           
## [53] plyr_1.8.4                 grid_3.4.1                
## [55] blob_1.1.0                 gdata_2.18.0              
## [57] forcats_0.2.0              lattice_0.20-35           
## [59] Biostrings_2.44.1          haven_1.1.0               
## [61] hms_0.3                    locfit_1.5-9.1            
## [63] codetools_0.2-15           reshape2_1.4.2            
## [65] biomaRt_2.32.1             XML_3.98-1.9              
## [67] glue_1.1.1                 evaluate_0.10.1           
## [69] modelr_0.1.0               cellranger_1.1.0          
## [71] gtable_0.2.0               assertthat_0.2.0          
## [73] broom_0.4.2                GenomicAlignments_1.12.1  
## [75] memoise_1.1.0              bindrcpp_0.2

References

Law, Charity W, Monther Alhamdoosh, Shian Su, Gordon K Smyth, and Matthew E Ritchie. 2016. “RNA-seq Analysis Is Easy as 1-2-3 with Limma, Glimma and EdgeR.” F1000Research 5 (17~jun): 1408. doi:10.12688/f1000research.9005.2.